After completing this lesson, you’ll be able to:
This course focuses on optimizing the performance of your workspaces. It contains specific advice for reading, writing, transformers, and FME Server.
The most important way to improve reading performance is to minimize the amount of data being read. As already mentioned, reading excess features counts as unnecessary work and is therefore inefficient. In many workspaces, the features that are read are preserved through most of the workflow. This means that every transformer can potentially run on all the input data, making it vital that you remove unnecessary data from your workflow as early as possible.
For example, this workspace reads nearly 14,000 features but immediately discards all except 419 of them (ones where the owner's name begins with "C"):
In this scenario, it would be much more efficient, if possible, to read only those approximately 400 features. Not only does this avoid reading unnecessary data, but it also avoids caching it twice over!
Fortunately, all formats have sets of parameters that speed up feature reading by limiting the amount of data being read:
The first of these – search envelope – defines the data to read as a geographic area, so that only that area of data needs to be read. These parameters are available on every spatial data reader but have the most effect when the source data is spatially indexed, because the query can then be carried out at its most efficient.
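The idea behind a search envelope can be sketched in a few lines of Python. This is only an illustration of the concept, not how FME implements it: the `Envelope` and `PointFeature` types, the `read_with_envelope` function, and the sample coordinates are all hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Envelope(NamedTuple):
    """Hypothetical rectangular search area."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


class PointFeature(NamedTuple):
    """Hypothetical point feature with a name and a coordinate."""
    name: str
    x: float
    y: float


def inside(envelope: Envelope, feature: PointFeature) -> bool:
    """True when the point lies within the envelope (edges included)."""
    return (envelope.min_x <= feature.x <= envelope.max_x
            and envelope.min_y <= feature.y <= envelope.max_y)


def read_with_envelope(features: Iterable[PointFeature],
                       envelope: Envelope) -> Iterator[PointFeature]:
    """Yield only the features inside the envelope.

    This sketch still visits every record; a spatially indexed source
    could skip records outside the envelope without visiting them.
    """
    return (f for f in features if inside(envelope, f))


# Made-up sample data for illustration only
features = [
    PointFeature("A", 1.0, 1.0),
    PointFeature("B", 5.0, 5.0),
    PointFeature("C", 9.0, 2.0),
]
area_of_interest = Envelope(min_x=0.0, min_y=0.0, max_x=6.0, max_y=6.0)
selected = [f.name for f in read_with_envelope(features, area_of_interest)]
```

Only the features inside the area of interest reach the rest of the pipeline. The linear scan here is the worst case; the benefit described above comes from a spatial index that lets the source avoid touching records outside the envelope at all.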
Similarly, there are a number of parameters designed to let the user define how many features to read. These parameters include the ability to define a maximum number of features to read, and what features to start with. There is also a parameter that defines which feature types (layers or tables) should be read.
By using these judiciously, the amount of data being read can be reduced and the translation sped up. For example, if we knew that the first records in the dataset were the ones beginning with "C", we could set Max Features to Read to 419.
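As a rough analogy for a maximum-features limit combined with a start feature, the Python sketch below stops pulling records from a source once the limit is reached. The `CountingSource` class, the `read_limited` function, and the owner names are hypothetical; they are not part of FME.

```python
from itertools import islice


class CountingSource:
    """Hypothetical stand-in for a dataset; counts how many records are pulled."""

    def __init__(self, records):
        self._records = records
        self.records_read = 0

    def __iter__(self):
        for record in self._records:
            self.records_read += 1
            yield record


def read_limited(source, max_features, start_feature=0):
    """Skip start_feature records, then return at most max_features records."""
    return list(islice(source, start_feature, start_feature + max_features))


# Made-up data: the records of interest happen to come first
owners = ["Carter", "Chan", "Cole", "Diaz", "Evans", "Ford"]
source = CountingSource(owners)
subset = read_limited(source, max_features=3)
```

Because `islice` stops consuming the source once it has the requested number of records, the remaining records are never read. As in the example above, this only helps when you know where the records of interest sit in the dataset.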
Other formats – particularly databases – have additional clauses that can help reduce the data flow:
Here, for example, this Geodatabase reader has a ‘WHERE Clause’ parameter that applies the test "owner name begins with 'C'" in a way that is more efficient than reading the entire contents of a large table and filtering it with a Tester transformer.
Another potential bottleneck, specifically for formats with a table list, is the case where you have more feature types than are necessary.
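The difference between filtering at the source and filtering after reading can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch using an in-memory SQLite database rather than a Geodatabase; the `parcels` table, its columns, and the owner names are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical table of property owners
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT)")
conn.executemany(
    "INSERT INTO parcels (owner) VALUES (?)",
    [("Carter",), ("Diaz",), ("Chan",), ("Evans",)],
)

# Filter late: every row is read, then most are discarded
# (comparable to reading the whole table and using a Tester)
all_rows = conn.execute("SELECT id, owner FROM parcels ORDER BY id").fetchall()
filtered_late = [row for row in all_rows if row[1].startswith("C")]

# Filter early: the database applies the test, so only matching rows are returned
# (comparable to a WHERE clause on the reader).
# Note: SQLite's LIKE is case-insensitive for ASCII letters by default.
filtered_early = conn.execute(
    "SELECT id, owner FROM parcels WHERE owner LIKE ? ORDER BY id",
    ("C%",),
).fetchall()

conn.close()
```

Both approaches produce the same rows, but the second transfers only the matching rows out of the database. The exact WHERE clause syntax depends on the database in use.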
Here the user has added two tables to their Geodatabase reader:
However, if you look at the workspace, you'll find that the PostcodeBoundaries table is not connected to anything. The unconnected table is still being read (and cached), but the data is being ignored:
Presumably, the user added the tables for some reason, but then decided they did not need them. In that case, they should delete the feature type from the FME workspace. Then the table will not be read and performance will improve.
When developing a workspace, it is easy to lose track of Feature Types that are unused, especially once the workspace grows in size. To quickly remove these unused Feature Types, go to Tools > Remove > Unattached... in the menu bar.
Obviously, this tool is less useful when there is just one unattached item but is more useful in a larger workspace with an unknown number of unattached objects.
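Conceptually, finding unattached feature types means finding the feature types that have no outgoing connection. The Python sketch below models this with plain lists; it is not FME's workspace format or API, and the `find_unattached` function and the "Parcels" table name are hypothetical (only PostcodeBoundaries comes from the example above).

```python
def find_unattached(feature_types, connections):
    """Return the feature types that are not the source of any connection."""
    connected = {source for source, _target in connections}
    return [ft for ft in feature_types if ft not in connected]


# Hypothetical model of the workspace described above:
# two reader feature types, only one of which feeds a transformer
feature_types = ["Parcels", "PostcodeBoundaries"]
connections = [("Parcels", "Tester")]

unattached = find_unattached(feature_types, connections)
```

In a small workspace the result is easy to spot by eye; the automated check earns its keep when there are many feature types and connections.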